Handle single manager reload by having workers reconnect#1775
Conversation
| // kick in. Because in a single manager cluster manager's gossip can't be | ||
| // bootstrapped unless some other node connects to it. | ||
| if nEvent.Type == NodeEventTypeLeave { | ||
| for _, ip := range nDB.bootStrapIP { |
There was a problem hiding this comment.
I would trigger this logic only in case len(nDB.bootStrapIP) == 1 to cover the specific case, else we ends up to retry to connect also in the case of 2 or more managers
There was a problem hiding this comment.
I was considering making it specific to the one manager case. But in the multiple manager case also workers connecting to a restarting manager should be ok, isn't it ? It will help in that manager node converging faster.
| // time. | ||
| nDB.networkClock.Witness(nEvent.LTime) | ||
|
|
||
| n := nDB.getNode(nEvent) |
There was a problem hiding this comment.
I wonder if we can just search if the node is in the nodes list and ignore it if it is in the leftNodes and failedNodes
There was a problem hiding this comment.
Currently checkAndGetNode gets called in handleNodeEvent. This function checks all the lists. Since we are adding getNode before checkAndGetNode its safer to keep the same search logic. Making it too restrictive (ie: searching only in nDB.nodes) and bailing out early can break some other scenario thats currently working.
Also, in this particular issue its ok even if the node is in the failedNodes. It can happen if there is out of order events and we have already handled NotifyLeave
|
Discussed yesterday with @sanimej and we concluded that at the moment this logic is the safest approach. @thaJeztah just FYI as a follow up of our previous discussion. This means that after this change you will see that logs for 24h (every 1s for the manager IPs) LGTM, would be ideal to handle the explicit leave with no retry but this change a part from some log spam won't affect functionality |
|
Unrelated to this PR. But I think retry every second is a bit too aggressive. That can be changed and also rate-limit the log message to one in every few mins or so. Will make that change in a separate PR. |
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
|
LGTM |
Related to moby #33346
In a cluster with one manager if that manager node is rebooted or daemon is restarted the manager node gets isolated from the gossip cluster. This is because in the single manager case its the one bootstrapping the gossip when other worker nodes connect to it. When that node restarts the libnetwork agent has no other node info to bootstrap the gossip.
Change in this PR is to have the workers try reconnecting to the bootstrapping manager periodically. This uses the already available reconnect logic which was originally added for the worker node, ie: if a worker node is isolated in the gossip cluster because of lets say network congestion when it thaws it will connect to the previously connected nodes to rebuild its gossip state. Normally this won't kick in when a node intentionally leaves the cluster which will be case in this scenario. This change is to treat it similar to gossip disconnect scenario so that the worker nodes can retry.
Signed-off-by: Santhosh Manohar santhosh@docker.com